February 12
Advancing Geological Carbon Storage Monitoring With 3D Digital Shadow Technology
Gahlot, Abhinav Prakash, Orozco, Rafael, Herrmann, Felix J.
Geological Carbon Storage (GCS) is a key technology for achieving global climate goals by capturing and storing CO2 in deep geological formations. Its effectiveness and safety rely on accurate monitoring of subsurface CO2 migration using advanced time-lapse seismic imaging. A Digital Shadow framework integrates field data, including seismic and borehole measurements, to track CO2 saturation over time. Machine learning-assisted data assimilation techniques, such as generative AI and nonlinear ensemble Bayesian filtering, update a digital model of the CO2 plume while incorporating uncertainties in reservoir properties. Compared to 2D approaches, 3D monitoring enhances the spatial accuracy of GCS assessments, capturing the full extent of CO2 migration. This study extends the uncertainty-aware 2D Digital Shadow framework by incorporating 3D seismic imaging and reservoir modeling, improving decision-making and risk mitigation in CO2 storage projects.
Enhancing Robustness Of Digital Shadow For CO2 Storage Monitoring With Augmented Rock Physics Modeling
Gahlot, Abhinav Prakash, Herrmann, Felix J.
To meet climate targets, the IPCC underscores the necessity of technologies capable of removing gigatonnes of CO2 annually, with Geological Carbon Storage (GCS) playing a central role. GCS involves capturing CO2 and injecting it into deep geological formations for long-term storage, requiring precise monitoring to ensure containment and prevent leakage. Time-lapse seismic imaging is essential for tracking CO2 migration but often struggles to capture the complexities of multi-phase subsurface flow. Digital Shadows (DS), leveraging machine learning-driven data assimilation techniques such as nonlinear Bayesian filtering and generative AI, provide a more detailed, uncertainty-aware monitoring approach. By incorporating uncertainties in reservoir properties, DS frameworks improve CO2 migration forecasts, reducing risks in GCS operations. However, data assimilation depends on assumptions regarding reservoir properties, rock physics models, and initial conditions, which, if inaccurate, can compromise prediction reliability. This study demonstrates that augmenting forecast ensembles with diverse rock physics models mitigates the impact of incorrect assumptions and improves predictive accuracy, particularly in differentiating uniform versus patchy saturation models.
PyPotteryInk: One-Step Diffusion Model for Sketch to Publication-ready Archaeological Drawings
Archaeological ceramics are a valuable source of information for reconstructing the customs, exchanges and social relationships of ancient populations, as well as for dating archaeological contexts (Sinopoli 1991; Peroni 1994; Steiner and Allason-Jones 2005; Vidale 2007; Orton and Hughes 2013; Hunt 2016). However, in order to turn a ceramic fragment into a rich source of scientific information, a long process of study and elaboration is required: once recovered in an excavation, the ceramic fragment is washed, catalogued, drawn and made ready for publication through the preparation of tables and figures that allow its correct interpretation and comparison with other archaeological contexts. Archaeological drawing is a fundamental and well-established tool in archaeological practice, and new technologies and methods are emerging to automate, standardise and speed up this process as much as possible. An example of this is the LAD (Laser Aided Profiler - Demján, Pavúk, and Roosevelt 2023), a tool that allows ceramic fragments to be 'drawn' quickly and accurately using a laser beam. Over time, however, many drawings were made by hand using traditional tools such as pencils and then had to be 'inked' and made ready for publication. Traditionally, this post-process was done by hand with Indian ink, and nowadays digital drawing programmes are used. This process is, however, extremely time-consuming and can often discourage the publication of new contexts because of the time and resources needed for inking. Generative AI can help to achieve this task, using complex image translation operations. Today, AI is permeating business, creativity and everyday life (Elliott 2019; Le et al. 2020; Varghese, Raj, and Venkatesh 2022; Azatbekova
Neural Network-based Vehicular Channel Estimation Performance: Effect of Noise in the Training Set
Ngorima, Simbarashe Aldrin, Helberg, Albert, Davel, Marelie H.
Vehicular communication systems face significant challenges due to high mobility and rapidly changing environments, which affect the channel over which the signals travel. To address these challenges, neural network (NN)-based channel estimation methods have been suggested. These methods are primarily trained on high signal-to-noise ratio (SNR) data, on the assumption that training an NN in less noisy conditions results in good generalisation. This study examines the effectiveness of training NN-based channel estimators on mixed SNR datasets compared to training solely on high SNR datasets, as seen in several related works. Estimators evaluated in this work include an architecture that uses convolutional layers and self-attention mechanisms; a method that employs temporal convolutional networks and data pilot-aided estimation; two methods that combine classical methods with multilayer perceptrons; and the current state-of-the-art model that combines Long Short-Term Memory networks with data pilot-aided and temporal averaging methods as post-processing. Our results indicate that using only high SNR data for training is not always optimal, and the SNR range in the training dataset should be treated as a hyperparameter that can be adjusted for better performance. This is illustrated by the better performance of some models in low SNR conditions when trained on the mixed SNR dataset, as opposed to when trained exclusively on high SNR data.
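The difference between a high-SNR and a mixed-SNR training set can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes real-valued signals and additive white Gaussian noise, uses placeholder sinusoids instead of vehicular channel data, and the names `add_awgn`, `make_training_set` and the SNR grid are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add real-valued white Gaussian noise so the expected SNR equals snr_db."""
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def make_training_set(clean_signals, snr_choices_db, rng):
    """Pair each clean signal with a noisy copy at an SNR drawn from snr_choices_db."""
    dataset = []
    for clean in clean_signals:
        snr_db = rng.choice(snr_choices_db)
        dataset.append((add_awgn(clean, snr_db, rng), clean, snr_db))
    return dataset


rng = random.Random(0)
# Placeholder "clean" signals (assumption): phase-shifted sinusoids.
clean_signals = [[math.sin(0.1 * n + k) for n in range(64)] for k in range(100)]

# The set of SNR values is the hyperparameter the abstract refers to.
high_snr_set = make_training_set(clean_signals, [40], rng)
mixed_snr_set = make_training_set(clean_signals, [0, 10, 20, 30, 40], rng)
```

Treating `snr_choices_db` as a tunable argument mirrors the abstract's recommendation that the training SNR range be adjusted like any other hyperparameter.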
CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction
Lee, Jaewan, Park, Changyoung, Yang, Hongjun, Lim, Sungbin, Han, Sehui
Recent advancements in AI have revolutionized property prediction in materials science and accelerated material discovery. Graph neural networks (GNNs) stand out due to their ability to represent crystal structures as graphs, effectively capturing local interactions and delivering superior predictions. However, these methods often lose critical global information, such as crystal systems and repetitive unit connectivity. To address this, we propose CAST, a cross-attention-based multimodal fusion model that integrates graph and text modalities to preserve essential material information. CAST combines node- and token-level features using cross-attention mechanisms, surpassing previous approaches reliant on material-level embeddings like graph mean-pooling or [CLS] tokens. A masked node prediction pretraining strategy further enhances atomic-level information integration. Our method achieved up to 22.9% improvement in property prediction across four crystal properties including band gap compared to methods like CrysMMNet and MultiMat. Pretraining was key to aligning node and text embeddings, with attention maps confirming its effectiveness in capturing relationships between nodes and tokens. This study highlights the potential of multimodal learning in materials science, paving the way for more robust predictive models that incorporate both local and global information.
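To make the idea of fusing node-level and token-level features concrete, here is a minimal sketch of generic scaled dot-product cross-attention in which graph nodes act as queries and text tokens as keys and values. It is not CAST's architecture: there are no learned projections, and the feature vectors and the function name `cross_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows, one output row per query.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical features: 3 atoms (graph nodes) and 2 text tokens, dimension 2.
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
token_features = [[2.0, 0.0], [0.0, 2.0]]

fused, attn = cross_attention(node_features, token_features, token_features)
```

Each row of `attn` is an attention map from one node to every token, which is the kind of node-token relationship the abstract says was inspected.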
TEE4EHR: Transformer Event Encoder for Better Representation Learning in Electronic Health Records
Karami, Hojjat, Atienza, David, Ionescu, Anisoara
Irregular sampling of time series in electronic health records (EHRs) is one of the main challenges for developing machine learning models. Additionally, the pattern of missing data in certain clinical variables is not at random but depends on the decisions of clinicians and the state of the patient. A point process is a mathematical framework for analyzing event sequence data that is consistent with irregular sampling patterns. Our model, TEE4EHR, is a transformer event encoder (TEE) with point process loss that encodes the pattern of laboratory tests in EHRs. The utility of our TEE has been investigated in a variety of benchmark event sequence datasets. Additionally, we conduct experiments on two real-world EHR databases to provide a more comprehensive evaluation of our model. Firstly, in a self-supervised learning approach, the TEE is jointly learned with an existing attention-based deep neural network, which gives superior performance in negative log-likelihood and future event prediction. We also propose an algorithm for aggregating attention weights that can reveal the interaction between the events. Secondly, we transfer and freeze the learned TEE to the downstream task for the outcome prediction, where it outperforms state-of-the-art models for handling irregularly sampled time series. Furthermore, our results demonstrate that our approach can improve representation learning in EHRs and can be useful for clinical prediction tasks.
TimEHR: Image-based Time Series Generation for Electronic Health Records
Karami, Hojjat, Hartley, Mary-Anne, Atienza, David, Ionescu, Anisoara
Electronic health records (EHRs) chart patients' interactions with the health system and contain critical information for improving services and supporting research. Data from these systems are routinely incorporated into machine learning and statistical models for clinical decision support on diagnostic and prognostic predictions, as well as for monitoring health and evaluating treatment response [1]. However, access to large-scale EHR datasets is challenging and governed by strict regulations on privacy and security (e.g. HIPAA and GDPR), meaning that many models are based on unicentric data with a high risk of poor generalizability [2]. Traditional approaches for anonymization can be complex and costly, often compromising the data's statistical integrity and failing to provide robust privacy guarantees [3, 4]. The use of synthetic data is thus emerging as a promising solution for optimizing the trade-off between privacy and statistical utility [5, 6]. Generative models, particularly Generative Adversarial Networks (GANs) [7], have shown great potential in producing distribution-preserving synthetic EHR data.
Lens: A Foundation Model for Network Traffic in Cybersecurity
Wang, Qineng, Qian, Chen, Li, Xiaochang, Yao, Ziyu, Shao, Huajie
Network traffic refers to the amount of data being sent and received over the internet or any system that connects computers. Analyzing and understanding network traffic is vital for improving network security and management. However, the analysis of network traffic is challenging due to the diverse nature of data packets, which often feature heterogeneous headers and encrypted payloads lacking semantics. To capture the latent semantics of traffic, a few studies have adopted pre-training techniques based on the Transformer encoder or decoder to learn the representations from massive traffic data. However, these methods typically excel only in traffic understanding (classification) or traffic generation tasks, not both. To address this issue, we develop Lens, a foundation model for network traffic that leverages the T5 architecture to learn the pre-trained representations from large-scale unlabeled data. Harnessing the strength of the encoder-decoder framework, which captures the global information while preserving the generative ability, our model can better learn the representations from raw data. To further enhance pre-training effectiveness, we design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP). Evaluation results across various benchmark datasets demonstrate that the proposed Lens outperforms the baselines in most downstream tasks related to both traffic understanding and generation. Notably, it also requires much less labeled data for fine-tuning compared to current methods.
Prompt Design and Engineering: Introduction and Advanced Methods
A prompt in generative AI models is the textual input provided by users to guide the model's output. This could range from simple questions to detailed descriptions or specific tasks. In the context of image generation models like DALL-E 3, prompts are often descriptive, while in LLMs like GPT-4 or Gemini, they can vary from simple queries to complex problem statements. Prompts generally consist of instructions, questions, input data, and examples. In practice, to elicit a desired response from an AI model, a prompt must contain either instructions or questions, with other elements being optional. Basic prompts in LLMs can be as simple as asking a direct question or providing instructions for a specific task. Advanced prompts involve more complex structures, such as "chain of thought" prompting, where the model is guided to follow a logical reasoning process to arrive at an answer.
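The composition rule described above (an instruction or a question is required; input data and examples are optional) can be sketched in a few lines. The function name `build_prompt`, the section labels and the sample wording are hypothetical choices for illustration, not a format prescribed by any particular model.

```python
def build_prompt(instruction=None, question=None, input_data=None, examples=()):
    """Assemble a prompt string from the four element types named in the text."""
    if not instruction and not question:
        raise ValueError("a prompt needs an instruction or a question")
    parts = []
    if instruction:
        parts.append(f"Instruction: {instruction}")
    for example in examples:
        parts.append(f"Example: {example}")
    if input_data:
        parts.append(f"Input: {input_data}")
    if question:
        parts.append(f"Question: {question}")
    return "\n".join(parts)


# A basic prompt: just a direct question.
basic = build_prompt(question="What is 17 * 6?")

# A chain-of-thought style prompt: the instruction asks for explicit reasoning.
cot = build_prompt(
    instruction="Reason step by step, then state the final answer.",
    question="What is 17 * 6?",
)
```

The only difference between `basic` and `cot` is the added instruction, which is what steers the model toward spelling out intermediate reasoning before answering.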
Preventing Extreme Polarization of Political Attitudes
Axelrod, Robert, Daymude, Joshua J., Forrest, Stephanie
Extreme polarization can undermine democracy by making compromise impossible and transforming politics into a zero-sum game. Ideological polarization - the extent to which political views are widely dispersed - is already strong among elites, but less so among the general public (McCarty, 2019, pp. 50-68). Strong mutual distrust and hostility between Democrats and Republicans in the U.S., combined with the elites' already strong ideological polarization, could lead to increasing ideological polarization among the public. The paper addresses two questions: (1) Is there a level of ideological polarization above which polarization feeds upon itself to become a runaway process? (2) If so, what policy interventions could prevent such dangerous positive feedback loops? To explore these questions, we present an agent-based model of ideological polarization that differentiates between the tendency for two actors to interact (exposure) and how they respond when interactions occur, positing that interaction between similar actors reduces their difference while interaction between dissimilar actors increases their difference. Our analysis explores the effects on polarization of different levels of tolerance to other views, responsiveness to other views, exposure to dissimilar actors, multiple ideological dimensions, economic self-interest, and external shocks. The results suggest strategies for preventing, or at least slowing, the development of extreme polarization.
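The interaction rule described above can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the paper's model: it uses a single ideological dimension clipped to [0, 1], the functional form of `interaction_probability` is assumed, and the parameter values are arbitrary.

```python
import random


def respond(x_i, x_j, tolerance, responsiveness):
    """Return actor i's new position after interacting with actor j."""
    gap = x_j - x_i
    if abs(gap) <= tolerance:
        # Similar actors: i moves toward j, shrinking their difference.
        new_x = x_i + responsiveness * gap
    else:
        # Dissimilar actors: i moves away from j, widening their difference.
        new_x = x_i - responsiveness * gap
    # Keep positions on the assumed [0, 1] ideological axis.
    return min(1.0, max(0.0, new_x))


def interaction_probability(x_i, x_j, exposure):
    """Assumed form: interaction becomes less likely as distance grows."""
    return 0.5 ** (abs(x_j - x_i) / exposure)


def step(positions, tolerance, responsiveness, exposure, rng):
    """Pick a random ordered pair and update the first actor if they interact."""
    i, j = rng.sample(range(len(positions)), 2)
    if rng.random() < interaction_probability(positions[i], positions[j], exposure):
        positions[i] = respond(positions[i], positions[j], tolerance, responsiveness)


rng = random.Random(0)
positions = [rng.random() for _ in range(50)]
for _ in range(10_000):
    step(positions, tolerance=0.25, responsiveness=0.25, exposure=0.1, rng=rng)
```

Separating `interaction_probability` (exposure) from `respond` (tolerance and responsiveness) mirrors the abstract's distinction between whether two actors interact and how they react when they do.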